Mathematical model
A mathematical model for automatic differentiation in machine learning
Automatic differentiation, as implemented today, does not have a simple mathematical model adapted to the needs of modern machine learning. In this work we articulate the relationship between differentiation of programs as implemented in practice and differentiation of nonsmooth functions. To this end we provide a simple class of functions and a nonsmooth calculus, and show how they apply to stochastic approximation methods. We also highlight the issue of artificial critical points created by algorithmic differentiation and show how the usual methods avoid these points with probability one.
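The artificial critical points mentioned above can be seen in a minimal, self-contained sketch (illustrative code, not the paper's formalism): the function f(x) = relu(x) - relu(-x) is exactly the identity, yet under the usual AD convention relu'(0) = 0, algorithmic differentiation reports a zero gradient at x = 0.

```python
# Artificial critical points from algorithmic differentiation: a sketch.
# f(x) = relu(x) - relu(-x) equals x everywhere, but AD with the
# convention relu'(0) = 0 reports slope 0 at the kink.

def relu(x):
    return x if x > 0 else 0.0

def d_relu(x):
    # Standard AD convention: derivative taken as 0 at the kink x = 0.
    return 1.0 if x > 0 else 0.0

def f(x):
    return relu(x) - relu(-x)  # identical to the identity function

def ad_grad_f(x):
    # Chain rule as an AD tool would apply it.
    return d_relu(x) - d_relu(-x) * (-1.0)

print(ad_grad_f(1.0))  # 1.0: correct away from the kink
print(ad_grad_f(0.0))  # 0.0: an artificial critical point (true slope is 1)
```

Away from zero the computed gradient is the correct value 1; only the point x = 0, where the chain rule composes two zero subgradients, becomes an artificial critical point.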
Quantum computers turned out to be more useful than expected in 2025
For the past year, I kept bringing the same story to my editor: quantum computers are on the edge of becoming useful for scientific discovery. Of course, that has always been the goal. The idea of using quantum computers to better understand our universe is part of their origin story, and it even featured in a 1981 speech by Richard Feynman. Contemplating the best way to simulate nature, he said: "We can give up on our rule about what the computer was, we can say: Let the computer itself be built of quantum mechanical elements which obey quantum mechanical laws." Today, Feynman's vision has been realised by Google, IBM and dozens more companies and academic teams. Their devices are now being used to simulate reality at the quantum level - and here are some highlights.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- North America > United States > Maryland (0.05)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.05)
- Information Technology > Hardware (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence (1.00)
AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library
Kong, Minwei, Qu, Ao, Guo, Xiaotong, Ouyang, Wenbin, Jiang, Chonghe, Zheng, Han, Ma, Yining, Zhuang, Dingyi, Tang, Yuhan, Li, Junyi, Wang, Shenhao, Koutsopoulos, Haris, Wang, Hai, Wu, Cathy, Zhao, Jinhua
Optimization modeling enables critical decisions across industries but remains difficult to automate: informal language must be mapped to precise mathematical formulations and executable solver code. Prior LLM approaches either rely on brittle prompting or require costly retraining with limited generalization. We present AlphaOPT, a self-improving experience library that enables an LLM to learn from limited demonstrations (even answers alone, without gold-standard programs) and solver feedback - without annotated reasoning traces or parameter updates. AlphaOPT operates in a continual two-phase cycle: (i) a Library Learning phase that reflects on failed attempts, extracting solver-verified, structured insights as {taxonomy, condition, explanation, example}; and (ii) a Library Evolution phase that diagnoses retrieval misalignments and refines the applicability conditions of stored insights, improving transfer across tasks. This design (1) learns efficiently from limited demonstrations without curated rationales, (2) expands continually without costly retraining by updating the library rather than model weights, and (3) makes knowledge explicit and interpretable for human inspection and intervention. Experiments show that AlphaOPT steadily improves with more data (65% to 72% from 100 to 300 training items) and surpasses the strongest baseline by 7.7% on the out-of-distribution OptiBench dataset when trained only on answers. Code and data are available at: https://github.com/Minw913/AlphaOPT.
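The {taxonomy, condition, explanation, example} insight record and the two-phase learn/refine cycle can be sketched as follows. This is a hypothetical illustration: the names `Insight`, `ExperienceLibrary`, `learn`, `refine`, and `retrieve`, and the keyword-matching retrieval, are assumptions for exposition, not AlphaOPT's actual API.

```python
# Hypothetical sketch of a self-improving experience library with
# solver-verified insights and condition-based retrieval.
from dataclasses import dataclass

@dataclass
class Insight:
    taxonomy: str     # e.g. "constraints/logical"
    condition: str    # keyword-style applicability condition
    explanation: str  # lesson extracted from a failed attempt
    example: str      # short modeling snippet illustrating the fix

class ExperienceLibrary:
    def __init__(self):
        self.insights = []

    def learn(self, insight):                  # Library Learning phase
        self.insights.append(insight)

    def refine(self, insight, new_condition):  # Library Evolution phase
        insight.condition = new_condition      # fix a retrieval misalignment

    def retrieve(self, task_description):
        # Naive retrieval: insights whose condition matches the task text.
        return [i for i in self.insights
                if i.condition.lower() in task_description.lower()]

lib = ExperienceLibrary()
ins = Insight("constraints/logical", "either-or",
              "Model either-or constraints with a binary variable and big-M.",
              "x <= M*y;  z <= M*(1-y)")
lib.learn(ins)
hits = lib.retrieve("Plant runs either shift A or shift B (either-or choice).")
print(len(hits))  # 1
```

In the real system, retrieval and condition refinement would be driven by the LLM and solver feedback rather than substring matching, but the data flow is the same: failed attempts produce insights, and misretrieved insights get their conditions updated in place.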
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
Adaptive tumor growth forecasting via neural & universal ODEs
Subramanian, Kavya, Joshi, Prathamesh Dinesh, Dandekar, Raj Abhijit, Dandekar, Rajat, Panat, Sreedath
Forecasting tumor growth is critical for optimizing treatment. Classical growth models such as the Gompertz and Bertalanffy equations capture general tumor dynamics but may fail to adapt to patient-specific variability, particularly with limited data available. In this study, we leverage Neural Ordinary Differential Equations (Neural ODEs) and Universal Differential Equations (UDEs), two pillars of Scientific Machine Learning (SciML), to construct adaptive tumor growth models capable of learning from experimental data. Using the Gompertz model as a baseline, we replace rigid terms with adaptive neural networks to capture hidden dynamics through robust modeling in the Julia programming language. We use our models to perform forecasting under data constraints and symbolic recovery to transform the learned dynamics into explicit mathematical expressions. Our approach has the potential to improve predictive accuracy, guiding dynamic and effective treatment strategies for improved clinical outcomes.
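The Gompertz baseline and the UDE idea can be sketched in a few lines. The paper works in Julia; the sketch below is a Python translation of the same structure, with illustrative parameter values (a = 0.1, K = 100) that are not taken from the paper.

```python
# Forward-Euler sketch of the Gompertz model dV/dt = a * V * ln(K / V),
# and the Universal ODE idea of replacing the rigid growth term with a
# learnable function g (in practice a small neural network).
import math

def gompertz_rhs(V, a=0.1, K=100.0):
    return a * V * math.log(K / V)

def ude_rhs(V, g):
    # Keep the V-proportional structure; let g stand in for ln(K/V).
    return V * g(V)

def simulate(rhs, V0, dt=0.01, steps=10_000):
    V = V0
    for _ in range(steps):
        V += dt * rhs(V)  # explicit Euler step
    return V

# The Gompertz solution saturates near the carrying capacity K.
V_final = simulate(gompertz_rhs, V0=1.0)
# A UDE with g chosen to match ln(K/V) reproduces the same dynamics;
# training would instead fit g to patient-specific data.
V_ude = simulate(lambda V: ude_rhs(V, g=lambda v: 0.1 * math.log(100.0 / v)),
                 V0=1.0)
print(round(V_final, 1))
```

The point of the UDE formulation is exactly this substitution: the known multiplicative structure is kept, while the hand-written ln(K/V) term is replaced by a trainable component that can absorb patient-specific deviations.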
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.48)
CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling
Tang, Zhengyang, Ye, Zihan, Huang, Chenyu, Huang, Xuhan, Li, Chengpeng, Li, Sihang, Chen, Guanhua, Yan, Ming, Wang, Zizhuo, Zha, Hongyuan, Liu, Dayiheng, Wang, Benyou
Large Reasoning Models (LRMs) have demonstrated strong capabilities in complex multi-step reasoning, opening new opportunities for automating optimization modeling. However, existing domain adaptation methods, originally designed for earlier instruction-tuned models, often fail to exploit the advanced reasoning patterns of modern LRMs -- in particular, we show that direct fine-tuning on traditional non-reflective datasets leads to limited gains. To fully leverage LRMs' inherent reasoning abilities, we propose CALM (Corrective Adaptation with Lightweight Modification), a framework that progressively refines LRMs within their native reasoning modes for optimization modeling tasks. In CALM, an expert intervener identifies reasoning flaws and provides concise corrective hints, which the LRM incorporates to produce improved reasoning trajectories. These interventions modify fewer than 2.6% of generated tokens, but generate high-quality data for soft adaptation through supervised fine-tuning. The adapted model is then further improved through reinforcement learning. Building on CALM, we develop STORM (Smart Thinking Optimization Reasoning Model), a 4B-parameter LRM that achieves a new state-of-the-art average accuracy of 68.9% across five popular optimization modeling benchmarks, matching the performance of a 671B LRM. These results demonstrate that dynamic, hint-based data synthesis both preserves and amplifies the native reasoning patterns of modern LRMs, offering a more effective and scalable path towards expert-level performance on challenging optimization modeling tasks.
Algorithms and data structures for automatic precision estimation of neural networks
We describe algorithms and data structures that extend a neural network library with automatic precision estimation for floating point computations. We also discuss the conditions under which the estimates are exact while preserving the high computational performance of neural network training and inference. Numerical experiments show the consequences of significant precision loss for particular values, such as inference outputs and gradients, and reveal deviations from mathematically predicted behavior. It turns out that almost any neural network accumulates computational inaccuracies; as a result, its behavior does not coincide with that predicted by the mathematical model of the network. This shows that tracking computational inaccuracies is important for reliable inference and training and for the interpretability of results.
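A minimal illustration of the underlying phenomenon (not the paper's data structures): naive floating point accumulation drifts away from the mathematically exact result, and the drift can be exposed by comparing against error-free summation.

```python
# Accumulated rounding error in naive summation vs math.fsum,
# which computes a correctly rounded sum of the float inputs.
import math

xs = [0.1] * 1_000_000      # exact mathematical sum is 100000.0

naive = 0.0
for x in xs:
    naive += x              # each += rounds; errors accumulate

exact = math.fsum(xs)       # correctly rounded summation
drift = abs(naive - exact)
print(drift > 0)            # True: behavior deviates from the math model
```

Each individual addition is accurate to machine precision, yet a million of them produce a measurable deviation; the same mechanism, compounded across the many accumulations in a forward and backward pass, is what precision tracking in a neural network library has to account for.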
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York > New York County > New York City (0.04)
An N-Plus-1 GPT Agency for Critical Solution of Mechanical Engineering Analysis Problems
Patera, Anthony, Abeyaratne, Rohan
Generative AI, and specifically GPT, can produce a remarkable solution to a mechanical engineering analysis problem - but also, on occasion, a flawed solution. For example, an elementary mechanics problem is solved flawlessly in one GPT instance and incorrectly in a subsequent GPT instance, with a success probability of only 85%. This unreliability renders "out-of-the-box" GPT unsuitable for deployment in education or engineering practice. We introduce an "N-Plus-1" GPT Agency for Initial (Low-Cost) Analysis of mechanical engineering Problem Statements. Agency first launches N instantiations of Agent Solve to yield N independent Proposed Problem Solution Realizations; Agency then invokes Agent Compare to summarize and compare the N Proposed Problem Solution Realizations and to provide a Recommended Problem Solution. We argue from Condorcet's Jury Theorem that, for a Problem Statement characterized by per-Solve success probability greater than 1/2 (and N sufficiently large), the Predominant (Agent Compare) Proposed Problem Solution will, with high probability, correspond to a Correct Proposed Problem Solution. Furthermore, Agent Compare can also incorporate aspects of Secondary (Agent Compare) Proposed Problem Solutions, in particular when the latter represent alternative Problem Statement interpretations - different Mathematical Models - or alternative Mathematical Solution Procedures. Comparisons to Grok Heavy, a commercial multi-agent model, show similarities in design and performance, but also important differences in emphasis: our Agency focuses on transparency and pedagogical value.
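The Condorcet argument above is easy to check numerically. The sketch below (illustrative code, not the Agency's implementation) computes the probability that a majority of N independent Solves is correct when each succeeds with probability p; odd N avoids ties.

```python
# Condorcet-style majority success probability for N independent solves,
# each correct with probability p: P(more than N/2 successes), odd N.
from math import comb

def majority_correct(p, n):
    # Sum the binomial tail above the majority threshold (n // 2 + 1).
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

p = 0.85  # per-Solve success probability from the abstract
for n in (1, 3, 5, 9):
    print(n, round(majority_correct(p, n), 4))
```

With p = 0.85 the majority is already more reliable than any single Solve at N = 3, and reliability keeps increasing with N, which is exactly the jury-theorem behavior the Agency relies on for p > 1/2.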
Automated Optimization Modeling through Expert-Guided Large Language Model Reasoning
Yang, Beinuo, Zhou, Qishen, Li, Junyi, Su, Chenxing, Hu, Simon
Optimization Modeling (OM) is essential for solving complex decision-making problems. However, the process remains time-consuming and error-prone, heavily relying on domain experts. While Large Language Models (LLMs) show promise in addressing these challenges through their natural language understanding and reasoning capabilities, current approaches face three critical limitations: high benchmark labeling error rates reaching up to 42%, narrow evaluation scope that only considers optimal values, and computational inefficiency due to heavy reliance on multi-agent systems or model fine-tuning. In this work, we first enhance existing datasets through systematic error correction and more comprehensive annotation. Additionally, we introduce LogiOR, a new optimization modeling benchmark from the logistics domain, containing more complex problems with standardized annotations. Furthermore, we present ORThought, a novel framework that leverages expert-level optimization modeling principles through chain-of-thought reasoning to automate the OM process. Through extensive empirical evaluation, we demonstrate that ORThought outperforms existing approaches, including multi-agent frameworks, with particularly significant advantages on complex optimization problems. Finally, we provide a systematic analysis of our method, identifying critical success factors and failure modes and offering valuable insights for future research on LLM-based optimization modeling.
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > China > Hong Kong (0.04)
Large Language Models for Supply Chain Decisions
Simchi-Levi, David, Mellou, Konstantina, Menache, Ishai, Pathuri, Jeevan
Supply Chain Management requires addressing a variety of complex decision-making challenges, from sourcing strategies to planning and execution. Over the last few decades, advances in computation and information technologies have enabled the transition from manual, intuition- and experience-based decision-making to more automated and data-driven decisions using a variety of tools that apply optimization techniques. These techniques use mathematical methods to improve decision-making. Unfortunately, business planners and executives still need to spend considerable time and effort to (i) understand and explain the recommendations coming out of these technologies; (ii) analyze various scenarios and answer what-if questions; and (iii) update the mathematical models used in these tools to reflect current business environments. Addressing these challenges requires involving data science teams and/or the technology providers to explain results or make the necessary changes in the technology, which significantly slows down decision-making. Motivated by recent advances in Large Language Models (LLMs), we report how this disruptive technology can democratize supply chain technology - namely, facilitate the understanding of tools' outcomes and enable direct interaction with supply chain tools without a human in the loop. Specifically, we report how we apply LLMs to address the three challenges described above, substantially reducing the time to decision from days and weeks to minutes and hours and dramatically increasing planners' and executives' productivity and impact.
Counterfactual optimization for fault prevention in complex wind energy systems
Carrizosa, Emilio, Fischetti, Martina, Haaker, Roshell, Morales, Juan Miguel
Machine Learning models are increasingly used in businesses to detect faults and anomalies in complex systems. In this work, we take this approach a step further: beyond merely detecting anomalies, we aim to identify the optimal control strategy that restores the system to a safe state with minimal disruption. We frame this challenge as a counterfactual problem: given a Machine Learning model that classifies system states as either "good" or "anomalous," our goal is to determine the minimal adjustment to the system's control variables (i.e., its current status) that is necessary to return it to the "good" state. To achieve this, we leverage a mathematical model that finds the optimal counterfactual solution while respecting system-specific constraints. Notably, most counterfactual analysis in the literature focuses on individual cases where a person seeks to alter their status relative to a decision made by a classifier--such as for loan approval or medical diagnosis. Our work addresses a fundamentally different challenge: optimizing counterfactuals for a complex energy system, specifically an offshore wind turbine oil-type transformer. This application not only advances counterfactual optimization in a new domain but also opens avenues for broader research in this area. Our tests on real-world data provided by our industrial partner show that our methodology easily adapts to user preferences and brings savings on the order of 3 million euros per year in a typical farm.
Introduction
Energy systems are becoming increasingly complex, making it more challenging--and more critical--to detect faults early and develop strategies to mitigate them. In this context, Machine Learning (ML) techniques have become an industry standard for early fault detection [16]. Energy companies can monitor various sensor readings from the turbines and apply ML methods to identify potential issues with components.
In this paper, we define a fault (or faulty state) as a condition where a component is in an unsafe status, while an anomaly refers to any irregularity that is not necessarily dangerous. Note that faults are a subset of anomalies. When a fault is detected, a controller is immediately activated to prevent severe damage to the turbine. Machine Learning models can detect anomalies in advance, providing companies with a window of time to intervene before faults occur.
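The minimal-adjustment counterfactual has a closed form in the simplest setting. The sketch below assumes a linear classifier score(x) = w·x + b with "good" meaning a non-negative score, and omits the system-specific constraints the paper's model enforces; it is an illustration of the counterfactual idea, not the authors' method.

```python
# Least-squares counterfactual for a linear "good"/"anomalous" classifier:
# project the anomalous state onto the decision boundary and push it
# slightly past it by a small margin.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def counterfactual(x, w, b, margin=1e-6):
    score = dot(w, x) + b
    if score >= 0:
        return list(x)  # already in the "good" state; no adjustment needed
    # Move along w just far enough to reach score = +margin; this is the
    # minimal Euclidean adjustment for a linear boundary.
    step = (margin - score) / dot(w, w)
    return [xi + step * wi for xi, wi in zip(x, w)]

w, b = [2.0, -1.0], -1.0   # illustrative classifier, not from the paper
x = [0.0, 1.0]             # anomalous state: score = -2
x_cf = counterfactual(x, w, b)
print(dot(w, x_cf) + b >= 0)  # True: restored to the "good" state
```

With a nonlinear classifier or operational constraints on the control variables, this projection becomes a constrained optimization problem, which is where the mathematical programming model described in the abstract comes in.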
- Europe > Spain > Andalusia > Seville Province > Seville (0.04)
- Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
- Europe > Northern Europe (0.04)
- (2 more...)
- Research Report (0.82)
- Overview (0.67)